Data Refining for Text Mining Process in Aviation Safety Data
Abstract
Successful data mining is an iterative process in which the data are refined and adjusted to achieve more accurate mining results. The most important tools in the text mining context are the list of stop words and the list of synonyms. The size and richness of these lists depend on the structure of the language used in the text to be mined. English, for example, is an “easy” language for search technologies because, with a few exceptions, the stem of a word is not inflected and terms are formed from several separate words rather than compounds. Morphologically rich languages such as Finnish, by contrast, require special attention to these definitions. This chapter introduces the need for, and the realisation of, refining the source data for a successful data mining process based on the results achieved in the first mining round.
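The refinement loop described above can be illustrated with a minimal sketch. It is not the chapter's actual method: the stop word list, the synonym map, the sample reports, and the function names `normalise` and `term_counts` are all hypothetical, chosen only to show how a second mining round can reuse lists adjusted after the first one.

```python
import re
from collections import Counter

# Hypothetical starter lists; a real project would derive them from its corpus.
STOP_WORDS = {"the", "a", "was", "and", "to", "of", "on", "during"}
SYNONYMS = {"aircraft": "airplane", "plane": "airplane", "rwy": "runway"}


def normalise(text, stop_words, synonyms):
    """Lower-case, tokenise, drop stop words, map synonyms to one canonical term."""
    # [^\W\d_]+ matches runs of letters only (including non-ASCII letters).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [synonyms.get(t, t) for t in tokens if t not in stop_words]


def term_counts(reports, stop_words, synonyms):
    """Count canonical terms over all reports (one 'mining round')."""
    counts = Counter()
    for report in reports:
        counts.update(normalise(report, stop_words, synonyms))
    return counts


# Invented sample reports, for illustration only.
reports = [
    "The aircraft was cleared to land on rwy 22",
    "Plane crossed the runway during taxi",
]

# First mining round with the starter lists.
first_round = term_counts(reports, STOP_WORDS, SYNONYMS)

# Refinement: after reviewing the first round, an analyst decides that
# "cleared" carries no signal here and adds it to the stop word list.
refined_stop_words = STOP_WORDS | {"cleared"}
second_round = term_counts(reports, refined_stop_words, SYNONYMS)

print(first_round.most_common(3))
print(second_round.most_common(3))
```

This sketch only does exact-match lookups, which is roughly adequate for English. For a morphologically rich language such as Finnish, the tokens would presumably need lemmatisation or compound splitting before the stop word and synonym lists are applied.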
Similar resources
International Journal of Occupational Safety and Ergonomics, 2007, volume 13
The main objective of this study was to analyze anomalies voluntarily reported by pilots in the civil aviation sector and to identify factors leading to such anomalies. Experimental data were obtained from the NASA Aviation Safety Reporting System (ASRS) database. These data contained a range of text records spanning 30 years of civilian aviation, both commercial (airline operations) and general aviat...
Improving Performance of Classification Models with Textual Data
The main objective of this study is to measure the effect of unstructured text on classification performance. A large dataset of aviation incident reports was used. In aviation incidents, the proportion attributable to human factors is close to 90%. Therefore, accurate identification of the presence of human factors in past aviation incidents is critical to improving aviation safet...
The identification of factors contributing to self-reported anomalies in civil aviation.
The main objective of this study was to analyze anomalies voluntarily reported by pilots in the civil aviation sector and to identify factors leading to such anomalies. Experimental data were obtained from the NASA Aviation Safety Reporting System (ASRS) database. These data contained a range of text records spanning 30 years of civilian aviation, both commercial (airline operations) and general aviat...
Presenting a Model for Extracting Information from Textual Documents Based on Text Mining in the E-Learning Domain
As computer networks become the backbone of science and the economy, enormous quantities of documents become available. Text mining techniques have therefore been used to extract useful information from textual data. Text mining has become an important research area that discovers unknown information, facts, or new hypotheses by automatically extracting information from different written documents. T...
Application of Provalis Research Corp.’s Statistical Content Analysis Text Mining to Airline Safety Reports
All data and information in this document are provided "as is," without any expressed or implied warranty of any kind, including as to the accuracy, completeness, currentness, non-infringement, merchantability, or fitness for any purpose. The views and opinions expressed in this document do not necessarily reflect those of the Global Aviation Information Network or any of its participants, ex...
Publication date: 2009